 new customer


Improving After-sales Service: Deep Reinforcement Learning for Dynamic Time Slot Assignment with Commitments and Customer Preferences

Mao, Xiao, Schrotenboer, Albert H., Wu, Guohua, van Jaarsveld, Willem

arXiv.org Artificial Intelligence

Problem definition: For original equipment manufacturers (OEMs), high-tech maintenance is a strategic component in after-sales services, involving close coordination between customers and service engineers. Each customer suggests several time slots for their maintenance task, from which the OEM must select one. This decision needs to be made promptly to support customers' planning. At the end of each day, routes for service engineers are planned to fulfill the tasks scheduled for the following day. In this paper, we study this hierarchical and sequential decision-making problem, the Dynamic Time Slot Assignment Problem with Commitments and Customer Preferences (DTSAP-CCP). Methodology/results: Two distinct approaches are proposed: 1) an attention-based deep reinforcement learning approach with rollout execution (ADRL-RE) and 2) a scenario-based planning approach (SBP). The ADRL-RE combines a well-trained attention-based neural network with a rollout framework for online trajectory simulation. To support the training, we develop a neural heuristic solver that provides rapid route planning solutions, enabling efficient learning in complex combinatorial settings. The SBP approach samples several scenarios to guide the time slot assignment. Numerical experiments demonstrate the superiority of ADRL-RE and the stability of SBP compared to both rule-based and rollout-based approaches. Furthermore, the strong practicality of ADRL-RE is verified in a case study of after-sales service for large medical equipment. Implications: This study provides OEMs with practical decision-support tools for dynamic maintenance scheduling, balancing customer preferences and operational efficiency. In particular, our ADRL-RE shows strong real-world potential, supporting timely and customer-aligned maintenance scheduling.


Single-agent or Multi-agent Systems? Why Not Both?

Gao, Mingyan, Li, Yanzi, Liu, Banruo, Yu, Yifan, Wang, Phillip, Lin, Ching-Yu, Lai, Fan

arXiv.org Artificial Intelligence

Multi-agent systems (MAS) decompose complex tasks and delegate subtasks to different large language model (LLM) agents and tools. Prior studies have reported the superior accuracy performance of MAS across diverse domains, enabled by long-horizon context tracking and error correction through role-specific agents. However, the design and deployment of MAS incur higher complexity and runtime cost compared to single-agent systems (SAS). Meanwhile, frontier LLMs, such as OpenAI-o3 and Gemini-2.5-Pro, have rapidly advanced in long-context reasoning, memory retention, and tool usage, mitigating many limitations that originally motivated MAS designs. In this paper, we conduct an extensive empirical study comparing MAS and SAS across various popular agentic applications. We find that the benefits of MAS over SAS diminish as LLM capabilities improve, and we propose efficient mechanisms to pinpoint the error-prone agent in MAS. Furthermore, the performance discrepancy between MAS and SAS motivates our design of a hybrid agentic paradigm, request cascading between MAS and SAS, to improve both efficiency and capability. Our design improves accuracy by 1.1-12% while reducing deployment costs by up to 20% across various agentic applications.
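The request cascading idea can be pictured with a minimal sketch. The abstract does not describe the actual routing mechanism, so everything here is an assumption made for illustration: the `Answer` type, the self-reported `confidence` score, the `threshold` value, and the stub agents `toy_sas` and `toy_mas` are all hypothetical. The sketch tries the cheaper single-agent system first and escalates to the multi-agent system only when the first answer looks unreliable.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Answer:
    text: str
    confidence: float  # hypothetical self-reported score in [0, 1]


# An agent is modelled as any callable from a request string to an Answer.
Agent = Callable[[str], Answer]


def cascade(request: str, sas: Agent, mas: Agent,
            threshold: float = 0.7) -> Tuple[Answer, str]:
    """Try the cheaper single-agent system first; escalate only when unsure."""
    first = sas(request)
    if first.confidence >= threshold:
        return first, "sas"
    # Low confidence: pay the extra cost of the multi-agent system.
    return mas(request), "mas"


# Stub agents standing in for real LLM-backed systems (illustration only).
def toy_sas(request: str) -> Answer:
    # Pretend that short requests are easy and long ones are hard.
    return Answer("sas:" + request, 0.9 if len(request) < 20 else 0.3)


def toy_mas(request: str) -> Answer:
    return Answer("mas:" + request, 0.8)


if __name__ == "__main__":
    print(cascade("2 + 2?", toy_sas, toy_mas)[1])
    print(cascade("plan a multi-step data migration", toy_sas, toy_mas)[1])
```

In this toy setup, the cost saving comes from the fraction of requests that never reach the multi-agent path; how the escalation signal is obtained in practice is left open here.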


Why Do Multi-Agent LLM Systems Fail?

Cemri, Mert, Pan, Melissa Z., Yang, Shuyi, Agrawal, Lakshya A., Chopra, Bhavya, Tiwari, Rishabh, Keutzer, Kurt, Parameswaran, Aditya, Klein, Dan, Ramchandran, Kannan, Zaharia, Matei, Gonzalez, Joseph E., Stoica, Ion

arXiv.org Artificial Intelligence

Despite growing enthusiasm for Multi-Agent Systems (MAS), where multiple LLM agents collaborate to accomplish tasks, their performance gains across popular benchmarks remain minimal compared to single-agent frameworks. This gap highlights the need to analyze the challenges hindering MAS effectiveness. In this paper, we present the first comprehensive study of MAS challenges. We analyze five popular MAS frameworks across over 150 tasks, involving six expert human annotators. We identify 14 unique failure modes and propose a comprehensive taxonomy applicable to various MAS frameworks. This taxonomy emerges iteratively from agreements among three expert annotators per study, achieving a Cohen's Kappa score of 0.88. These fine-grained failure modes are organized into three categories: (i) specification and system design failures, (ii) inter-agent misalignment, and (iii) task verification and termination. To support scalable evaluation, we integrate MASFT with LLM-as-a-Judge. We also explore whether the identified failures could be easily prevented by proposing two interventions: improved specification of agent roles and enhanced orchestration strategies. Our findings reveal that the identified failures require more complex solutions, highlighting a clear roadmap for future research. We open-source our dataset and LLM annotator.
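For readers unfamiliar with the agreement statistic quoted above, the following is a minimal sketch of Cohen's kappa for two annotators, computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name and the toy labels are illustrative assumptions. The abstract does not say how the score was aggregated over three annotators, so this shows only the standard pairwise definition.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two annotators labelling the same items."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("both annotators must label the same non-empty item set")
    n = len(a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of each annotator's label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_e == 1.0:
        # Both annotators used one identical label throughout; treating this
        # as perfect agreement is a convention chosen for this sketch.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    annotator_1 = [1, 1, 0, 0]
    annotator_2 = [1, 0, 0, 0]
    print(cohen_kappa(annotator_1, annotator_2))
```

In the toy example the annotators agree on three of four items (p_o = 0.75) while chance agreement is p_e = 0.5, so kappa is 0.5.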


A Deep Learning Approach for Imbalanced Tabular Data in Advertiser Prospecting: A Case of Direct Mail Prospecting

Farhang, Sadegh, Hayes, William, Murphy, Nick, Neddenriep, Jonathan, Tyris, Nicholas

arXiv.org Artificial Intelligence

Acquiring new customers is a vital process for growing businesses. Prospecting is the process of identifying and marketing to potential customers using methods ranging from online digital advertising and linear television to out-of-home and direct mail. Despite the rapid growth in digital advertising (particularly social and search), research shows that direct mail remains one of the most effective ways to acquire new customers. However, there is a notable gap in the application of modern machine learning techniques within the direct mail space, which could significantly enhance targeting and personalization strategies. Methodologies deployed through direct mail are the focus of this paper. In this paper, we propose a supervised learning approach for identifying new customers, i.e., prospecting, which comprises how we define labels for our data and rank potential customers. The casting of prospecting to a supervised learning problem leads to imbalanced tabular data. The current state-of-the-art approach for tabular data is an ensemble of tree-based methods like random forest and XGBoost. We propose a deep learning framework for tabular imbalanced data. This framework is designed to tackle large imbalanced datasets with a vast number of numerical and categorical features. Our framework comprises two components: an autoencoder and a feed-forward neural network. We demonstrate the effectiveness of our framework through a transparent real-world case study of prospecting in direct mail advertising. Our results show that our proposed deep learning framework outperforms the state-of-the-art tree-based random forest approach when applied in the real world.


Taxi dispatching strategies with compensations

Billhardt, Holger, Fernández, Alberto, Ossowski, Sascha, Palanca, Javier, Bajo, Javier

arXiv.org Artificial Intelligence

Urban mobility efficiency is of utmost importance in big cities. Taxi vehicles are key elements in daily traffic activity. The advance of ICT and geo-positioning systems has given rise to new opportunities for improving the efficiency of taxi fleets in terms of waiting times of passengers, cost and time for drivers, traffic density, CO2 emissions, etc., by using more informed, intelligent dispatching. Still, the explicit spatial and temporal components, as well as the scale and, in particular, the dynamicity of the problem of pairing passengers and taxis in big towns, render traditional approaches for solving standard assignment problems useless for this purpose, and call for intelligent approximation strategies based on domain-specific heuristics. Furthermore, taxi drivers are often autonomous actors and may not agree to participate in assignments that, though globally efficient, may not be sufficiently beneficial for them individually. This paper presents a new heuristic algorithm for taxi assignment to customers that considers taxi reassignments if this may lead to globally better solutions. In addition, as such new assignments may reduce the expected revenues of individual drivers, we propose an economic compensation scheme to make individually rational drivers agree to proposed modifications in their assigned clients. We carried out a set of experiments, where several commonly used assignment strategies are compared to three different instantiations of our heuristic algorithm. The results indicate that our proposal has the potential to reduce customer waiting times in fleets of autonomous taxis, while being also beneficial from an economic point of view.
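The interplay between reassignment and compensation can be illustrated with a toy sketch. It is not the algorithm from the paper, whose details the abstract does not give. The assumptions are: a driver's cost is the Manhattan pickup distance, a swap between two taxis is accepted only if it lowers their combined pickup distance, and a driver whose own distance increases is compensated for exactly that increase. The names `manhattan` and `propose_swap` and all coordinates are hypothetical.

```python
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]


def manhattan(p: Point, q: Point) -> float:
    """Grid distance between two points, used as a stand-in for pickup cost."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def propose_swap(
    taxis: Dict[str, Point],
    customers: Dict[str, Point],
    assignment: Dict[str, str],
    t1: str,
    t2: str,
) -> Optional[Tuple[Dict[str, str], Dict[str, float]]]:
    """Swap the customers of t1 and t2 if that lowers total pickup distance.

    Returns (new_assignment, compensations), or None if the swap does not help.
    """
    c1, c2 = assignment[t1], assignment[t2]
    old = {t1: manhattan(taxis[t1], customers[c1]),
           t2: manhattan(taxis[t2], customers[c2])}
    new = {t1: manhattan(taxis[t1], customers[c2]),
           t2: manhattan(taxis[t2], customers[c1])}
    if sum(new.values()) >= sum(old.values()):
        return None  # no global improvement, keep the current pairing
    # Compensate only the drivers whose individual cost went up.
    compensations = {t: max(0.0, new[t] - old[t]) for t in (t1, t2)}
    new_assignment = dict(assignment)
    new_assignment[t1], new_assignment[t2] = c2, c1
    return new_assignment, compensations


if __name__ == "__main__":
    taxis = {"t1": (0.0, 0.0), "t2": (2.0, 0.0)}
    customers = {"c1": (2.0, 3.0), "c2": (1.0, 0.0)}
    current = {"t1": "c1", "t2": "c2"}
    print(propose_swap(taxis, customers, current, "t1", "t2"))
```

In this example the swap cuts the combined pickup distance from 6 to 4, but taxi `t2` drives 2 units farther than before, so it is the one that receives compensation.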


It's not all about scores. Other criteria you should consider…

#artificialintelligence

As a data scientist or machine learning engineer, you spend much of your time improving a model's performance by creating new features, comparing different types of models, trying out new model architectures, and much more. In the end, it's the score on the test set that counts, so that is what you focus on when deciding on a model. However, as important as the model performance may be, there are other, secondary criteria you shouldn't forget about. What do you get from a model with almost perfect scores, if your MLOps department can't host it? How does the user feel, if the prediction is accurate, but it takes ages to get it?


The Hottest Startups in Lisbon

WIRED

Serial entrepreneurs Mila Suharev, Nils Henning, and Mitya Moskalchuk had been involved in the German startup scene for more than 15 years, successfully exiting four companies with valuations above €100 million (around $98.5 million) before deciding to launch their new startup in the Portuguese capital. "Lisbon has several ingredients making it a unique and efficient tech ecosystem," says Suharev, CEO of proptech company CASAFARI, listing factors such as quality of life, governmental programs designed to attract foreign entrepreneurs, and its Silicon Valley-like business mindset. Lisbon is increasingly becoming the tech hub of choice for many European entrepreneurs: Of the 10 CEOs profiled here, half are expats. "A new ecosystem such as the one growing in Lisbon is fascinating to experience firsthand," says Amir Bozorgzadeh, CEO of Virtualeap. "It is a melting pot of foreigners and Portuguese, working hand-in-hand amid a very sunny setting in which work-life balance is always a priority for founders."


Bank Customer Churn Prediction Using Machine Learning

#artificialintelligence

This article was published as a part of the Data Science Blogathon. Customer Churn prediction means knowing which customers are likely to leave or unsubscribe from your service. For many companies, this is an important prediction. This is because acquiring new customers often costs more than retaining existing ones. Once you've identified customers at risk of churn, you need to know exactly what marketing efforts you should make with each customer to maximize their likelihood of staying.


Using PROC DEEPCAUSAL to optimize revenue through policy evaluation

#artificialintelligence

When it comes to causal inference, scoring capability is particularly beneficial. It can be used in unique ways that result in an improved decision-making process, such as gaining optimal revenue using the least number of resources. In this post, I will introduce a new scoring capability and its use cases with PROC DEEPCAUSAL. I will also show you how it utilizes Deep Neural Networks (DNNs) to perform causal inference as well as policy evaluation and comparison. Inference is not valid for the estimators when the estimates from machine learning methods are directly plugged into an econometric model. Doing so creates highly biased estimators, so econometric methods need to correct for this bias.


Microsoft to restrict access to AI now deemed too risky

#artificialintelligence

Microsoft has pledged to clamp down on access to AI tools designed to predict emotions, gender, and age from images, and will restrict the usage of its facial recognition and generative audio models in Azure. The Windows giant made the promise on Tuesday while also sharing its so-called Responsible AI Standard, a document [PDF] in which the US corporation vowed to minimize any harm inflicted by its machine-learning software. This pledge included assurances that the biz will assess the impact of its technologies, document models' data and capabilities, and enforce stricter use guidelines. This is needed because – and let's just check the notes here – there are apparently not enough laws yet regulating machine-learning technology use. Thus, in the absence of this legislation, Microsoft will just have to force itself to do the right thing.